Jan-Mar © 2014 Transactions; 1(1): 17-22 


Research Article 




A Ranking Model Framework for Multiple Vertical Search Domains 


Dr. Shoban Babu Sriramoju and Ramesh Gadde


Corresponding Author: 
babuack@yahoo.com 


DOI: 
http://dx.doi.org/ 
10.17812/TJRA.1.1(4)2014 


Manuscript: 

Received: 24 March, 2014 
Accepted: 04th April, 2014
Published: 23rd April, 2014


ABSTRACT 


Vertical search domains make a huge amount of information
available. Often the result sets are so large that users must
browse further to find the piece of information they need. In this
context ranking plays an important role. When ranking is required
for every domain, explicitly developing a ranking algorithm for
each one is a tedious task. A ranking model that can adapt to
various domains is therefore needed. Recently Geng et al. proposed
an algorithm whose ranking model adapts to various domains, which
avoids writing separate algorithms for different domains. In this
paper we build a prototype application that implements the
algorithm to rank the search results of various domains. The
experimental results indicate that the algorithm is effective and
can be used in real world applications.
1. INTRODUCTION
The Internet has become an information superhighway
that offers a plethora of information across many
domains, and content is continually added to it from
all quarters of the world. Information retrieval from
the Internet has become correspondingly valuable
because it lets people obtain the information they
require. Vertical domains, of late, are helping people
to obtain the required information with ease. However,
search engines return such a large amount of
information that end users must spend considerable
time picking out what they actually need. To overcome
this problem, many ranking models came into existence,
including Ranking SVM [5], [6], RankNet [3], RankBoost
[4], and LambdaRank [1]. These algorithms help place
the information users need where they can find it
immediately.


Some search engines are domain specific: they have
moved from broad based search to domain specific
search in order to provide vertical search information
to end users. Such search engines are very useful
because they return the documents users require. Thus
many search engines came into existence that serve
images, music and video, acting on documents of
different types and formats.


Broad based search engines use ranking models built on
various techniques, such as Term Frequency (TF), to
rank results. However, broad based ranking models rank
information of any kind; they are so generalized that
they cannot provide domain specific results. When
ranking models are required for vertical domains, a
separate model is needed for each domain, which is
cumbersome and time consuming. Therefore an algorithm
that can adapt to various domains in the real world is
very much required.


Practical experience shows that broad based search
engines return information that is not immediately
useful unless the user browses further for more
accurate information. To overcome this problem,
ranking algorithms must adapt to new domains. Related
research is explored in [7], [8], [9], [10], and [11].
However, adapting to new domains, i.e. ranking model
adaptation, is a new research direction. Concept
drifting [12] and classifier adaptation existed in the
literature. Classifier adaptation deals mainly with
binary targets, whereas ranking model adaptation must
predict rankings. These classifiers therefore have
problems, as they could not adapt to new vertical
domains.


Recently Geng et al. [14] proposed a new ranking
adaptation algorithm that helps in domain specific
search. One algorithm can adapt to various domains and
serve all of them, rather than relying on large
amounts of labeled data for each one. The effective
adaptation of ranking models helped the algorithm to
serve users who need diverse information from across
the domains. Their algorithm made domain specific
search successful with a single ranking model, as it
can adapt to new domains. In this paper we implemented
that model and tested it practically. The empirical
results are encouraging. The remainder of the paper is
structured into sections. Section 2 focuses on review
of literature. Section 3 provides information about
the new ranking adaptation model. Section 4 gives
details about the experiments, results and evaluation
while section 5 provides conclusions.


2. RELATED WORK 


Many studies in the literature address ranking models
that rank the results returned by search engines.
However, most of them focus on general information
retrieval and are not specific to domains, and those
that are domain specific could not adapt to new
domains. Language models for information retrieval
[13], [14] and classical BM25 [15] worked well when
only a few parameters needed to be adjusted. However,
they were inadequate for adapting to new domains for
the purpose of ranking. The review of literature
therefore shows a need for a new ranking model that
can adapt to different domains without causing
problems. Before presenting the ranking model
adaptation concept, we review the previous models.


Of late, ranking algorithms came into existence that
place the most useful results at the top of the search
results. Some of these algorithms convert the ranking
problem into a classification problem. Examples
include LambdaRank [1], ListNet [2], RankNet [3],
RankBoost [4], and Ranking SVM [5], [6]. These
algorithms have a single objective, namely optimizing
the search results. Recently Geng et al. built a
ranking model adaptation algorithm that helps in
ranking the search results of any domain, which
reduces the need to develop a new ranking model for
each domain. Other ranking related models were
developed in [7], [8], [10], and [16]. In this paper
we implemented the model developed by Geng et al. [14].


RANKING ADAPTATION


In this section we provide details regarding the
ranking model that has been implemented in this paper.
Before that we describe the problem statement. Let Q
be the set of queries and D the set of documents.
Human annotators label the search results returned for
each query. Other ranking models studied include
PageRank [18] and HITS [17]. Estimating ranking is the
purpose of these algorithms. The number of documents
they return is small, and they depend on prior
knowledge in the form of labels and training samples.
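
To make the problem statement concrete, the sketch below shows one way the annotated results for a single query could be turned into preference pairs, which is the form of supervision that pairwise ranking models consume. The function name and the integer relevance labels are illustrative assumptions, not part of the prototype described in this paper.

```python
from itertools import combinations


def preference_pairs(labels):
    """Return index pairs (j, k) such that labels[j] > labels[k].

    `labels` holds the annotator-assigned relevance of each document
    returned for one query (hypothetical integer grades).
    Documents with equal labels produce no pair.
    """
    pairs = []
    for j, k in combinations(range(len(labels)), 2):
        if labels[j] > labels[k]:
            pairs.append((j, k))
        elif labels[k] > labels[j]:
            pairs.append((k, j))
    return pairs


# Hypothetical labels for three documents returned for one query.
example_labels = [2, 0, 1]
print(preference_pairs(example_labels))
```

Each pair says only that the first document should be ranked above the second, so ties in the labels contribute nothing to training.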


Ranking Adaptation SVM 


In this paper we assume that the target domain and the
auxiliary domain are similar, so that it is quite
possible to have a ranking model that can adapt to
various domains. Conventional regularization
frameworks such as Neural Networks [20] and SVM [19]
have problems in obtaining results across various
domains. This is an ill-posed problem that needs a
prior assumption, so these frameworks are not suitable
for ranking model adaptation as they stand. Therefore
we use a regularization framework that solves this
problem with the help of an adaptive ranking function.
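The ranking adaptation model which has been
proposed is as follows.


$$
\min_{f,\,\xi_{ijk}} \; \frac{1-\delta}{2}\,\|f\|^2 + \frac{\delta}{2}\,\|f - f^a\|^2 + C \sum_{i,j,k} \xi_{ijk}
$$

$$
\text{s.t.}\quad f(\phi(q_i, d_{ij})) - f(\phi(q_i, d_{ik})) \ge 1 - \xi_{ijk}, \quad \xi_{ijk} \ge 0,
$$

$$
\text{for } \forall i \in \{1, 2, \ldots, M\},\ \forall j\, \forall k \in \{1, 2, \ldots, n(q_i)\} \text{ with } y_{ij} > y_{ik}
$$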
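
As a rough illustration of how an adaptation objective of this kind could be evaluated, the sketch below assumes a linear ranking function f(x) = w · x, an auxiliary model given by a weight vector w_aux, a trade-off parameter delta between the two regularization terms, and a hinge-style slack for each preference pair. These names and the linear form are assumptions made for the example; it is not the prototype's code.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def ra_svm_objective(w, w_aux, pairs, delta, C):
    """Value of a ranking-adaptation objective for a linear model (sketch).

    w      : weights of the target ranking function f(x) = w . x
    w_aux  : weights of the auxiliary ranking function (assumed linear)
    pairs  : list of (x_pref, x_other) feature vectors, where x_pref
             belongs to the document that should be ranked higher
    delta  : assumed trade-off in [0, 1] between staying small and
             staying close to the auxiliary model
    C      : weight of the total slack
    """
    diff = [a - b for a, b in zip(w, w_aux)]
    regularizer = 0.5 * (1 - delta) * dot(w, w) + 0.5 * delta * dot(diff, diff)
    # Smallest slack that satisfies f(x_pref) - f(x_other) >= 1 - slack.
    slack = sum(max(0.0, 1.0 - (dot(w, xp) - dot(w, xo))) for xp, xo in pairs)
    return regularizer + C * slack


# Hypothetical two-feature example with one correctly ordered pair.
value = ra_svm_objective([1.0, 0.0], [1.0, 0.0], [([2.0, 0.0], [0.0, 0.0])], 0.5, 1.0)
print(value)
```

With delta close to 1 the model is pulled towards the auxiliary ranker, while delta close to 0 reduces the regularizer to the usual norm penalty of a ranking SVM trained on target data alone.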


Adapting to Multiple Domains 


The algorithm can be extended to ranking model
adaptation with multiple domains. New domains are
learned on the fly by the same ranking model in the
process of adaptation. Thus the system supports
domain specific search with only one ranking model.
The multiple domain adaptation can be formulated as
follows, with the assumption that certain auxiliary
functions help the system to adapt to new domains.


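$$
\text{s.t.}\quad f(\phi(q_i, d_{ij})) - f(\phi(q_i, d_{ik})) \ge 1 - \xi_{ijk}, \quad \xi_{ijk} \ge 0,
$$

$$
\text{for } \forall i \in \{1, 2, \ldots, M\},\ \forall j\, \forall k \in \{1, 2, \ldots, n(q_i)\} \text{ with } y_{ij} > y_{ik}
$$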
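
One plausible way to use several auxiliary functions is to penalize the distance from the target model to each of them, weighted by how much each auxiliary domain is trusted. The sketch below assumes linear models and hypothetical weights `thetas` that sum to one; the weighting scheme is an assumption for illustration and is not stated in this paper.

```python
def multi_aux_regularizer(w, aux_models, thetas, delta):
    """Regularizer that pulls w towards several auxiliary models (sketch).

    w          : weights of the target ranking function
    aux_models : list of auxiliary weight vectors, one per auxiliary domain
    thetas     : assumed non-negative trust weights, one per auxiliary
                 model, summing to 1
    delta      : assumed trade-off in [0, 1], as in the single-domain case
    """
    if len(aux_models) != len(thetas):
        raise ValueError("need exactly one theta per auxiliary model")
    if abs(sum(thetas) - 1.0) > 1e-9:
        raise ValueError("thetas must sum to 1")
    own = 0.5 * (1 - delta) * sum(a * a for a in w)
    adapt = 0.0
    for w_r, theta in zip(aux_models, thetas):
        adapt += theta * sum((a - b) ** 2 for a, b in zip(w, w_r))
    return own + 0.5 * delta * adapt


# Hypothetical target model and two auxiliary models, trusted equally.
print(multi_aux_regularizer([1.0, 1.0], [[1.0, 1.0], [0.0, 1.0]], [0.5, 0.5], 0.5))
```

With a single auxiliary model and a weight of 1, this reduces to the regularizer of the single-domain case.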


As expected in this paper, the data come from various
domains, each with features specific to that domain.
The ranking model we implement adapts to those
features. Using domain specific features and adapting
to new domains make this model useful and avoid the
need to build many ranking models. We also
incorporated the ranking loss into the framework,
together with document similarities. Margin rescaling
is considered as an optimization problem that rescales
the margin violations, if any, with respect to
adaptability. It is presented as follows.


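$$
\text{s.t.}\quad f(\phi(q_i, d_{ij})) - f(\phi(q_i, d_{ik})) \ge 1 - \xi_{ijk} - \sigma_{ijk}, \quad \xi_{ijk} \ge 0,
$$

$$
\text{for } \forall i \in \{1, 2, \ldots, M\},\ \forall j\, \forall k \in \{1, 2, \ldots, n(q_i)\} \text{ with } y_{ij} > y_{ik}
$$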
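In the same fashion slack rescaling is formulated as
follows:


$$
\text{s.t.}\quad f(\phi(q_i, d_{ij})) - f(\phi(q_i, d_{ik})) \ge 1 - \frac{\xi_{ijk}}{\sigma_{ijk}}, \quad \xi_{ijk} \ge 0,
$$

$$
\text{for } \forall i \in \{1, 2, \ldots, M\},\ \forall j\, \forall k \in \{1, 2, \ldots, n(q_i)\} \text{ with } y_{ij} > y_{ik}
$$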
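
The difference between the two rescaling schemes can be seen by computing the smallest slack that satisfies each constraint for a single preference pair. The sketch below assumes a per-pair rescaling term `sigma` and the constraint forms "score difference >= 1 - slack - sigma" for margin rescaling and "score difference >= 1 - slack / sigma" for slack rescaling. Both forms, and the function names, are assumptions made for illustration.

```python
def margin_rescaled_slack(score_diff, sigma):
    """Smallest slack >= 0 with score_diff >= 1 - slack - sigma (assumed form)."""
    return max(0.0, 1.0 - sigma - score_diff)


def slack_rescaled_slack(score_diff, sigma):
    """Smallest slack >= 0 with score_diff >= 1 - slack / sigma (assumed form).

    sigma must be positive for the division in the constraint to make sense.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return max(0.0, sigma * (1.0 - score_diff))


# Hypothetical pair whose score difference falls short of the unit margin.
print(margin_rescaled_slack(0.5, 0.25))
print(slack_rescaled_slack(0.5, 0.5))
```

Under these assumed forms, margin rescaling shifts the required margin by sigma, while slack rescaling scales the penalty paid for a violation by sigma.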


3. EXPERIMENTAL RESULTS 


We built a prototype application with a web based
interface to demonstrate the proof of concept. The
application is built on the Java/JEE platform using
JDBC, Servlets and JSP. The experiments were run on a
PC with 2 GB RAM and a Pentium Core 2 Dual processor.
We used the TD2003 and TD2004 datasets, which were
gathered from Internet sources. The performance of the
model is measured using normalized discounted
cumulative gain (NDCG) and mean average precision
(MAP). The results are compared with baseline methods:
Aux-Only, Tar-Only, and Lin-Comb.
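
For readers unfamiliar with the two measures, the sketch below shows a common way to compute NDCG at a truncation level and mean average precision from the relevance labels of a ranked list. The gain formula (2^rel - 1) and the log2 discount are one widely used convention and are assumed here; the paper does not state which variant the prototype uses.

```python
import math


def dcg_at_k(rels, k):
    """Discounted cumulative gain of the top-k relevance labels."""
    return sum((2 ** r - 1) / math.log2(i + 2) for i, r in enumerate(rels[:k]))


def ndcg_at_k(rels, k):
    """DCG normalized by the DCG of the ideal (sorted) ordering."""
    ideal = dcg_at_k(sorted(rels, reverse=True), k)
    return dcg_at_k(rels, k) / ideal if ideal > 0 else 0.0


def average_precision(rels):
    """Average precision for one query; any label > 0 counts as relevant."""
    hits, total = 0, 0.0
    for rank, r in enumerate(rels, start=1):
        if r > 0:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_average_precision(rels_per_query):
    """Mean of the per-query average precision values."""
    if not rels_per_query:
        return 0.0
    return sum(average_precision(r) for r in rels_per_query) / len(rels_per_query)


# Hypothetical relevance labels, listed in the order a ranker returned them.
print(ndcg_at_k([2, 1, 0], 3))
print(mean_average_precision([[1, 0, 1], [0, 1]]))
```

The truncation level on the horizontal axis of the figures corresponds to the parameter `k` in this sketch.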


[Figure: NDCG versus truncation level for Tar-Only, Aux-Only, Lin-Comb, and RA-SVM]


Fig. 1. TD2003 to TD2004 adaptation with five queries


Fig. 1 compares the adaptation performance of the
proposed algorithm with that of the other three
algorithms. The adaptation is from the TD2003 dataset
to the TD2004 dataset with five queries. The proposed
algorithm shows the best performance of all. Between
the two single-domain baselines, Aux-Only outperforms
Tar-Only.


Fig. 2. TD2003 to TD2004 adaptation with ten queries


Fig. 2 compares the adaptation performance of the
proposed algorithm with that of the other three
algorithms. The adaptation is from the TD2003 dataset
to the TD2004 dataset with ten queries. The proposed
algorithm again shows the best performance. With the
larger number of queries, Tar-Only outperforms
Aux-Only.


Fig. 3. NDCG results of web page search to image
search adaptation with five labeled queries


Fig. 3 shows adaptation from web page search to image
search with five labeled queries. The performance of
the proposed model is compared with the baseline
models, and the proposed model outperforms them.


Fig. 4. NDCG results of web page search to image
search adaptation with ten labeled queries


Fig. 4 shows adaptation from web page search to image
search with ten labeled queries. The performance of
the proposed model is compared with the baseline
models, and the proposed model outperforms them.


Fig. 5. NDCG results of web page search to image
search adaptation with twenty labeled queries


Fig. 5 shows adaptation from web page search to image
search with twenty labeled queries. The performance of
the proposed model is compared with the baseline
models, and the proposed model outperforms them.


Fig. 6. NDCG results of web page search to image
search adaptation with thirty labeled queries


Fig. 6 shows adaptation from web page search to image
search with thirty labeled queries. The performance of
the proposed model is compared with the baseline
models, and the proposed model outperforms them.


4. CONCLUSION


In this paper we studied information retrieval
systems such as search engines. We found that broad
based search engines cannot provide domain specific
information, while domain specific search engines
have difficulty adapting to new domains. These search
engines return a huge amount of information, which
leaves users dissatisfied because they must browse
and spend time to find exactly the information they
require. Various ranking models have improved this
situation considerably. However, those ranking models
could not adapt to new domains. In this paper we
implemented a ranking model that can adapt to new
domains and thus avoids building a separate algorithm
for each domain. Our model is based on the work done
by Geng et al. [14]. We built a prototype application
that demonstrates the proof of concept. The empirical
results are encouraging.